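We address a conversational search scenario with mixed initiative: the user asks and the system answers, and the system also asks (clarification questions) and the user answers. We focus on the task of selecting the next clarification question given the conversation context. Our method leverages passage retrieval for an initial selection of relevant candidate clarification questions, and fine-tunes two deep-learning models to re-rank these candidates. We evaluated our method on two different use cases. The first is open-domain conversational search over a large web collection. The second is a task-oriented customer-support setting. We show that our method performs well in both use cases.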
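Absolute pose regressor (APR) networks are trained to estimate the camera pose given a captured image. They compute latent image representations from which the camera position and orientation are regressed. Compared with structure-based localization schemes, which provide state-of-the-art accuracy, APRs offer a different trade-off between localization accuracy, runtime, and memory. In this work, we introduce camera pose auto-encoders (PAEs), multilayer perceptrons trained via a teacher-student approach to encode camera poses, using APRs as their teachers. We show that the resulting latent pose representations can closely reproduce APR performance, and we demonstrate their effectiveness for related tasks. Specifically, we propose a lightweight test-time optimization in which the closest training poses are encoded and used to refine the camera position estimate. This procedure achieves new state-of-the-art position accuracy for APRs on both the CambridgeLandmarks and 7Scenes benchmarks. We also show that training images can be reconstructed from the learned pose encodings, paving the way for integrating visual information from the training set at a low memory cost. Our code and pre-trained models are available at https://github.com/yolish/camera-pose-auto-coders.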
Directed information (DI) is a fundamental measure for the study and analysis of sequential stochastic models. In particular, when optimized over input distributions it characterizes the capacity of general communication channels. However, analytic computation of DI is typically intractable and existing optimization techniques over discrete input alphabets require knowledge of the channel model, which renders them inapplicable when only samples are available. To overcome these limitations, we propose a novel estimation-optimization framework for DI over discrete input spaces. We formulate DI optimization as a Markov decision process and leverage reinforcement learning techniques to optimize a deep generative model of the input process probability mass function (PMF). Combining this optimizer with the recently developed DI neural estimator, we obtain an end-to-end estimation-optimization algorithm which is applied to estimating the (feedforward and feedback) capacity of various discrete channels with memory. Furthermore, we demonstrate how to use the optimized PMF model to (i) obtain theoretical bounds on the feedback capacity of unifilar finite-state channels; and (ii) perform probabilistic shaping of constellations in the peak power-constrained additive white Gaussian noise channel.
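For readers unfamiliar with the quantity, the standard definition of directed information can be written out as follows. The notation ($X^n$, $Y^n$, causal conditioning $\|$) is the conventional one and is an assumption here, since the abstract itself fixes no symbols; the capacity expression is the usual normalized form and holds only for channel classes where that characterization applies.

```latex
% Directed information from the input sequence X^n to the output sequence Y^n
\[
  I(X^n \to Y^n) \;=\; \sum_{i=1}^{n} I\bigl(X^i ; Y_i \,\big|\, Y^{i-1}\bigr)
\]
% Causally conditioned input distribution (the object being optimized)
\[
  P\bigl(x^n \,\|\, y^{n-1}\bigr) \;=\; \prod_{i=1}^{n} P\bigl(x_i \,\big|\, x^{i-1}, y^{i-1}\bigr)
\]
% Feedback capacity as a normalized DI optimization (when this characterization applies)
\[
  C_{\mathrm{fb}} \;=\; \lim_{n \to \infty} \frac{1}{n}
  \max_{P(x^n \| y^{n-1})} I(X^n \to Y^n)
\]
```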
Participants in political discourse employ rhetorical strategies -- such as hedging, attributions, or denials -- to display varying degrees of belief commitments to claims proposed by themselves or others. Traditionally, political scientists have studied these epistemic phenomena through labor-intensive manual content analysis. We propose to help automate such work through epistemic stance prediction, drawn from research in computational semantics, to distinguish at the clausal level what is asserted, denied, or only ambivalently suggested by the author or other mentioned entities (belief holders). We first develop a simple RoBERTa-based model for multi-source stance predictions that outperforms more complex state-of-the-art modeling. Then we demonstrate its novel application to political science by conducting a large-scale analysis of the Mass Market Manifestos corpus of U.S. political opinion books, where we characterize trends in cited belief holders -- respected allies and opposed bogeymen -- across U.S. political ideologies.
While the rollout of the fifth-generation mobile network (5G) is underway across the globe with the intention of delivering 4K/8K UHD video, Augmented Reality (AR), and Virtual Reality (VR) content to large numbers of users, coverage and throughput remain among the most significant issues, especially in rural areas, where only low-frequency-band 5G is being deployed. This calls for a high-performance adaptive bitrate (ABR) algorithm that can maximize the user quality of experience (QoE) given 5G network characteristics and the data rates of UHD content. Many recently proposed ABR techniques are machine-learning based. Among them, Pensieve is one of the state-of-the-art techniques; it uses reinforcement learning to generate an ABR algorithm based on observations of past decision performance. By incorporating the context of the 5G network and UHD content, Pensieve has been optimized into Pensieve 5G. New QoE metrics that more accurately represent the QoE of UHD video streaming on different types of devices were proposed and used to evaluate Pensieve 5G against other ABR techniques, including the original Pensieve. Results from a simulation based on real 5G Standalone (SA) network throughput show that Pensieve 5G outperforms both conventional algorithms and Pensieve, with average QoE improvements of 8.8% and 14.2%, respectively. Additionally, Pensieve 5G performed well on a commercial 5G NR-NR Dual Connectivity (NR-DC) network, despite being trained solely on data from the 5G Standalone (SA) network.
Optimal transport (OT) has become a widely used tool in machine learning for measuring the discrepancy between probability distributions. For instance, OT is a popular loss function that quantifies the discrepancy between an empirical distribution and a parametric model. Recently, an entropic penalty term and the celebrated Sinkhorn algorithm have been commonly used to approximate the original OT in a computationally efficient way. However, since the Sinkhorn algorithm runs a projection associated with the Kullback-Leibler divergence, it is often vulnerable to outliers. To overcome this problem, we propose regularizing OT with the $\beta$-potential term associated with the so-called $\beta$-divergence, which was developed in robust statistics. Our theoretical analysis reveals that the $\beta$-potential can prevent mass from being transported to outliers. We experimentally demonstrate that the transport matrix computed with our algorithm helps estimate a probability distribution robustly even in the presence of outliers. In addition, our proposed method can successfully detect outliers in a contaminated dataset.
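To make the outlier issue concrete, here is a minimal sketch of the standard entropic (Sinkhorn) baseline that the abstract describes as vulnerable. It is not the proposed $\beta$-potential method. The function name `sinkhorn`, the toy points, and the values of `eps` and `n_iters` are all illustrative assumptions.

```python
import numpy as np

def sinkhorn(a, b, cost, eps=1.0, n_iters=500):
    """Standard entropic OT: alternately rescale the Gibbs kernel to match marginals."""
    K = np.exp(-cost / eps)            # Gibbs kernel derived from the cost matrix
    u = np.ones_like(a)
    v = np.ones_like(b)
    for _ in range(n_iters):
        u = a / (K @ v)                # rescale rows toward the source marginal a
        v = b / (K.T @ u)              # rescale columns to the target marginal b
    return u[:, None] * K * v[None, :] # transport plan diag(u) K diag(v)

# Toy 1-D data (hypothetical): the last target point is an outlier.
x = np.array([0.0, 1.0, 2.0])
y = np.array([0.0, 1.0, 10.0])
cost = (x[:, None] - y[None, :]) ** 2  # squared Euclidean cost
a = np.full(3, 1.0 / 3.0)              # uniform source weights
b = np.full(3, 1.0 / 3.0)              # uniform target weights

plan = sinkhorn(a, b, cost)
outlier_mass = plan[:, 2].sum()        # mass the plan sends to the outlier
```

Because the last update rescales the columns, the hard marginal constraint forces the outlier at `y = 10` to receive its full weight `b[2]` regardless of how costly that is, which is the behavior the abstract sets out to relax.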
Body Mass Index (BMI), age, height and weight are important indicators of human health, and they provide useful information for many practical purposes, such as health care, monitoring and re-identification. Most existing methods for health indicator prediction use front-view body or face images. These inputs are hard to obtain in daily life, and their strict requirements on view and pose often make the models less robust. In this paper, we propose to predict health indicators from gait videos, which are more prevalent in surveillance and home monitoring scenarios. However, the study of health indicator prediction from gait videos using deep learning has been hindered by the small amount of open-source data. To address this issue, we analyse the similarity and relationship between the pose estimation and health indicator prediction tasks, and then propose a paradigm that enables deep learning on small health indicator datasets by pre-training on the pose estimation task. Furthermore, to better suit the health indicator prediction task, we propose the Global-Local Aware aNd Centrosymmetric Encoder (GLANCE) module. It first extracts local and global features by progressive convolutions and then fuses multi-level features by a centrosymmetric double-path hourglass structure in two different ways. Experiments demonstrate that the proposed paradigm achieves state-of-the-art results for predicting health indicators on MoVi, and that the GLANCE module is also beneficial for pose estimation on 3DPW.
Existing training criteria in automatic speech recognition (ASR) permit the model to freely explore more than one time alignment between the feature and label sequences. In this paper, we use entropy to measure a model's uncertainty, i.e., how it chooses to distribute the probability mass over the set of allowed alignments. Furthermore, we evaluate the effect of entropy regularization in encouraging the model to concentrate the probability mass on a smaller subset of allowed alignments. Experiments show that entropy regularization enables a much simpler decoding method without sacrificing word error rate, and provides better time alignment quality.
Language modeling, a central task in natural language processing, involves estimating a probability distribution over strings. In most cases, the estimated distribution sums to 1 over all finite strings. However, in some pathological cases, probability mass can "leak" onto the set of infinite sequences. In order to characterize the notion of leakage more precisely, this paper offers a measure-theoretic treatment of language modeling. We prove that many popular language model families are in fact tight, meaning that they will not leak in this sense. We also generalize characterizations of tightness proposed in previous works.
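A small worked example may help show what leakage means. The model below is a hypothetical construction for illustration and is not taken from the paper: a single-symbol autoregressive model that, at step $t$, emits the end-of-sequence symbol with probability $p_t$ and otherwise emits the symbol $a$.

```latex
% Probability that the model has not terminated after T steps
\[
  \Pr(\text{no EOS in steps } 1,\dots,T) \;=\; \prod_{t=1}^{T} (1 - p_t)
\]
% Non-tight choice: p_t = 1/(t+1)^2, a telescoping product
\[
  \prod_{t=1}^{T} \Bigl(1 - \tfrac{1}{(t+1)^2}\Bigr)
  \;=\; \prod_{n=2}^{T+1} \frac{(n-1)(n+1)}{n^2}
  \;=\; \frac{T+2}{2(T+1)}
  \;\xrightarrow[T \to \infty]{}\; \frac{1}{2}
\]
% Tight choice: constant p_t = p with 0 < p <= 1
\[
  \prod_{t=1}^{T} (1 - p) \;=\; (1-p)^{T} \;\xrightarrow[T \to \infty]{}\; 0
\]
```

With $p_t = 1/(t+1)^2$ the finite strings receive total probability $1/2$ and the remaining $1/2$ leaks onto the infinite sequence, so the model is not tight; with a constant $p_t$ all mass stays on finite strings.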
The intersection of ground reaction forces in a small, point-like area above the center of mass has been observed in computer simulation models and human walking experiments. This intersection point is often called a virtual pivot point (VPP). With the VPP observed so ubiquitously, it is commonly assumed to provide postural stability for bipedal walking. In this study, we challenge this assumption by questioning if walking without a VPP is possible. Deriving gaits with a neuromuscular reflex model through multi-stage optimization, we found stable walking patterns that show no signs of the VPP-typical intersection of ground reaction forces. We, therefore, conclude that a VPP is not necessary for upright, stable walking. The non-VPP gaits found are stable and successfully rejected step-down perturbations, which indicates that a VPP is not primarily responsible for locomotion robustness or postural stability. However, a collision-based analysis indicates that non-VPP gaits increased the potential for collisions between the vectors of the center of mass velocity and ground reaction forces during walking, suggesting an increased mechanical cost of transport. Although our computer simulation results have yet to be confirmed through experimental studies, they already strongly challenge the existing explanation of the VPP's function and provide an alternative explanation.